11 research outputs found

Faster data structures and graphics hardware techniques for high performance rendering

Author: Ganestam Per
Publication venue: 'Lund University Library'
Publication date: 01/01/2016

Computer generated imagery is used in a wide range of disciplines, each with different requirements. As an example, real-time applications such as computer games have completely different restrictions and demands than offline rendering of feature films. A game has to render quickly using only limited resources, yet present visually adequate images. Film and visual effects rendering may not have strict time requirements but are still required to render efficiently, utilizing huge render systems with hundreds or even thousands of CPU cores. In real-time rendering, with limited time and hardware resources, the goal is always to produce the highest rendering quality possible within the given constraints. The first paper in this thesis presents an analytical hardware model together with a feedback system that guarantees the highest level of image quality subject to a limited time budget. As graphics processing units grow more powerful, power consumption becomes a critical issue. Smaller handheld devices have only a limited source of energy, their battery, and both small devices and high-end hardware are required to minimize energy consumption to avoid overheating. The second paper presents experiments and analysis which consider power usage across a range of real-time rendering algorithms and shadow algorithms executed on high-end, integrated and handheld hardware. Computing accurate reflection and refraction effects has long been considered available only in offline rendering, where time isn’t a constraint. The third paper presents a hybrid approach, combining the speed of real-time rendering algorithms and hardware with the quality of offline methods to render high quality reflections and refractions in real-time. The fourth and fifth papers present improvements in construction time and quality of Bounding Volume Hierarchies (BVH).
Building BVHs faster reduces rendering time in offline rendering and brings ray tracing a step closer towards a feasible real-time approach. Bonsai, presented in the fourth paper, constructs BVHs on CPUs faster than contemporary competing algorithms and produces BVHs of a very high quality. Following Bonsai, the fifth paper presents an algorithm that refines BVH construction by allowing triangles to be split. Although splitting triangles increases construction time, it generally allows for higher quality BVHs. The fifth paper introduces a triangle splitting BVH construction approach that builds BVHs with quality on a par with an earlier high quality splitting algorithm. However, the method presented in paper five is several times faster in construction time.

Lund University Publications

Auto-tuning Interactive Ray Tracing using an Analytical GPU Architecture Model

Author: Doggett Michael
Ganestam Per
Publication date: 01/01/2012

This paper presents a method for auto-tuning interactive ray tracing on GPUs using a hardware model. Getting full performance from modern GPUs is a challenging task. Workloads which require a guaranteed performance over several runs must select parameters for the worst performance of all runs. Our method uses an analytical GPU performance model to predict the current frame’s rendering time using a selected set of parameters. These parameters are then optimised for a selected frame rate performance on the particular GPU architecture. We use auto-tuning to determine parameters such as Phong shading, shadow rays and the number of ambient occlusion rays. We sample a priori information about the current rendering load to estimate the frame workload. A GPU model is run iteratively using this information to tune rendering parameters for a target frame rate. We use the OpenCL API allowing tuning across different GPU architectures. Our auto-tuning enables the rendering of each frame to execute in a predicted time, so a target frame rate can be achieved even with widely varying scene complexities. Using this method we can select optimal parameters for the current execution taking into account the current viewpoint and scene, achieving performance improvements over predetermined parameters.
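
The tuning loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' GPU model: the linear cost function, the quality score, the candidate parameter values and the constant `ms_per_mray` are all assumptions made up for illustration. The sketch only shows the general idea of picking the best-looking settings whose predicted frame time fits a budget.

```python
from itertools import product

def predicted_frame_ms(pixels, phong, shadow_rays, ao_rays, ms_per_mray=0.5):
    """Hypothetical cost model: time grows linearly with the rays traced."""
    rays_per_pixel = 1 + shadow_rays + ao_rays   # primary + secondary rays
    shading_factor = 1.2 if phong else 1.0       # assumed Phong overhead
    mrays = pixels * rays_per_pixel / 1e6
    return mrays * ms_per_mray * shading_factor

def quality(phong, shadow_rays, ao_rays):
    """Hypothetical quality score: more rays and Phong shading score higher."""
    return (2 if phong else 0) + 3 * shadow_rays + ao_rays

def tune(pixels, budget_ms):
    """Return (quality, phong, shadow_rays, ao_rays, predicted_ms) for the
    highest-scoring setting that fits the budget, or None if nothing fits."""
    best = None
    for phong, shadow, ao in product([False, True], [0, 1], [0, 2, 4, 8, 16]):
        t = predicted_frame_ms(pixels, phong, shadow, ao)
        if t <= budget_ms:
            cand = (quality(phong, shadow, ao), phong, shadow, ao, t)
            if best is None or cand[0] > best[0]:
                best = cand
    return best
```

Calling `tune(1280 * 720, 4.0)` searches the candidate grid for a 720p frame with a 4 ms budget. A real system would replace the toy cost function with a calibrated model and re-run the search every frame as the workload estimate changes.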

Lund University Publications

Power Efficiency for Software Algorithms running on Graphics Processors

Author: Akenine-Möller Tomas
Doggett Michael
Ganestam Per
Johnsson Björn M
Publication venue: Eurographics - European Association for Computer Graphics
Publication date: 01/01/2012

Power efficiency has become the most important consideration for many modern computing devices. In this paper, we examine power efficiency of a range of graphics algorithms on different GPUs. To measure power consumption, we have built a power measuring device that samples currents at a high frequency. Comparing power efficiency of different graphics algorithms is done by measuring power and performance of three different primary rendering algorithms and three different shadow algorithms. We measure these algorithms’ power signatures on a mobile phone, on an integrated CPU and graphics processor, and on high-end discrete GPUs, and then compare power efficiency across both algorithms and GPUs. Our results show that power efficiency is not always proportional to rendering performance and that, for some algorithms, power efficiency varies across different platforms. We also show that for some algorithms, energy efficiency is similar on all platforms.
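
As a rough illustration of how sampled currents turn into an efficiency figure, the sketch below integrates power over uniformly spaced current samples and divides by the number of frames rendered. It assumes a constant supply voltage and a fixed sample rate; the function names and the idea of reporting joules per frame are illustrative assumptions, not a description of the authors' measurement device.

```python
def energy_joules(currents_a, voltage_v, sample_rate_hz):
    """Approximate energy by summing P = V * I over uniformly spaced samples."""
    dt = 1.0 / sample_rate_hz          # seconds between samples
    return sum(voltage_v * i * dt for i in currents_a)

def joules_per_frame(currents_a, voltage_v, sample_rate_hz, frames):
    """Energy spent per rendered frame over the sampled interval."""
    return energy_joules(currents_a, voltage_v, sample_rate_hz) / frames
```

With this metric, an algorithm that renders more frames per second can still cost more energy per frame if its current draw rises faster than its frame rate.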

Lund University Publications

Explicit Cache Management for Volume Ray-Casting on Parallel Architectures

Author: Doggett Michael
Ganestam Per
Jönsson Daniel
Ropinski Timo
Ynnerman Anders
Publication venue: Eurographics - European Association for Computer Graphics
Publication date: 01/01/2012

A major challenge when designing general purpose graphics hardware is to allow efficient access to texture data. Although different rendering paradigms vary with respect to their data access patterns, there is no flexibility when it comes to data caching provided by the graphics architecture. In this paper we focus on volume ray-casting, and show the benefits of algorithm-aware data caching. Our Marching Caches method exploits inter-ray coherence and thus utilizes the memory layout of the highly parallel processors by allowing them to share data through a cache which marches along with the ray front. By exploiting Marching Caches we can apply higher-order reconstruction and enhancement filters to generate more accurate and enriched renderings with an improved rendering performance. We have tested our Marching Caches with seven different filters, e.g., Catmull-Rom, B-spline, ambient occlusion projection, and could show that a speedup of four times can be achieved compared to using the caching implicitly provided by the graphics hardware, and that the memory bandwidth to global memory can be reduced by orders of magnitude. Throughout the paper, we will introduce the Marching Cache concept, provide implementation details and discuss the performance and memory bandwidth impact when using different filters.

Lund University Publications

SAH guided spatial split partitioning for fast BVH construction

Author: Doggett Michael
Ganestam Per
Publication venue: 'Wiley'
Publication date: 27/05/2016

We present a new SAH guided approach to subdividing triangles as the scene is coarsely partitioned into smaller sets of spatially coherent triangles. Our triangle split approach is integrated into the partitioning stage of a fast BVH construction algorithm, but may as well be used as a standalone pre-split pass. Our algorithm significantly reduces the number of split triangles compared to previous methods, while at the same time improving ray tracing performance compared to competing fast BVH construction techniques. We compare performance on Intel’s Embree ray tracer and show that BVH construction with our splitting algorithm is always faster than Embree’s pre-split construction algorithm. We also show that our algorithm builds significantly improved quality trees that deliver higher ray tracing performance. Our algorithm is implemented into Embree’s open source ray tracing framework, and the source code will be released late 2015.

Lund University Publications

Real-Time Multiply Recursive Reflections and Refractions using Hybrid Rendering

Author: Doggett Michael
Ganestam Per
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

We present a new method for real-time rendering of multiple recursions of reflections and refractions. The method uses the strengths of real-time ray tracing for objects close to the camera, by storing them in a per-frame constructed bounding volume hierarchy (BVH). For objects further from the camera, rasterization is used to create G-buffers which store an image-based representation of the scene outside the near objects. Rays that exit the BVH continue tracing in the G-buffers’ perspective space using ray marching, and can even be reflected back into the BVH. Our hybrid renderer is to our knowledge the first method to merge real-time ray tracing techniques with image-based rendering to achieve smooth transitions from accurately ray-traced foreground objects to image-based representations in the background. We are able to achieve more complex reflections and refractions than existing screen space techniques, and offer reflections by off-screen objects. Our results demonstrate that our algorithm is capable of rendering multiple-bounce reflections and refractions, for scenes with millions of triangles, at 720p resolution and above 30 FPS.

Lund University Publications

Bonsai: Rapid Bounding Volume Hierarchy Generation using Mini Trees

Author: Akenine-Möller Tomas
Barringer Rasmus
Doggett Michael
Ganestam Per
Publication venue: 'Williams College'
Publication date: 01/01/2015

We present an algorithm, called Bonsai, for rapidly building bounding volume hierarchies for ray tracing. Our method starts by computing midpoints of the triangle bounding boxes and then performs a rough hierarchical top-down split using the midpoints, creating triangle groups with tight bounding boxes. For each triangle group, a mini tree is built using an improved sweep SAH method. Once all mini trees have been built, we use them as leaves when building the top tree of the bounding volume hierarchy. We also introduce a novel and inexpensive optimization technique, called mini-tree pruning, that can be used to detect and improve poorly built parts of the tree. We achieve a little better than 100% in ray-tracing performance compared to a "ground truth" greedy top-down sweep SAH method, and our build times are the lowest we have seen with comparable tree quality
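
The sweep SAH step mentioned in this abstract can be sketched in its textbook form. This is a plain sweep along one axis, not Bonsai's improved variant, and it omits the mini-tree grouping and pruning stages. The box representation, the function names, and the simplified cost (surface area times primitive count, without the parent-area normalization or traversal constant) are assumptions chosen to keep the example short.

```python
def area(box):
    """Surface area of an axis-aligned box ((minx, miny, minz), (maxx, maxy, maxz))."""
    lo, hi = box
    dx, dy, dz = (hi[i] - lo[i] for i in range(3))
    return 2.0 * (dx * dy + dy * dz + dz * dx)

def union(a, b):
    """Smallest box enclosing both a and b."""
    return (tuple(min(a[0][i], b[0][i]) for i in range(3)),
            tuple(max(a[1][i], b[1][i]) for i in range(3)))

def sweep_sah_split(boxes, axis):
    """Return (cost, order, k): primitives order[:k] go left, order[k:] go right."""
    order = sorted(range(len(boxes)),
                   key=lambda i: 0.5 * (boxes[i][0][axis] + boxes[i][1][axis]))
    n = len(order)
    # Right-to-left sweep: right_area[k] is the area bounding order[k:].
    right_area = [0.0] * (n + 1)
    acc = None
    for k in range(n - 1, 0, -1):
        b = boxes[order[k]]
        acc = b if acc is None else union(acc, b)
        right_area[k] = area(acc)
    # Left-to-right sweep: grow the left box and evaluate every split position.
    best = (float("inf"), order, 0)
    acc = None
    for k in range(1, n):
        b = boxes[order[k - 1]]
        acc = b if acc is None else union(acc, b)
        cost = area(acc) * k + right_area[k] * (n - k)
        if cost < best[0]:
            best = (cost, order, k)
    return best
```

A full builder would run this sweep on all three axes, keep the cheapest split, and recurse on both halves until a leaf-size threshold is reached.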

Lund University Publications

Ray Classification for Accelerated BVH Traversal

Author: Ganestam Per
Müller Thomas
Pharr Matt
Wald Ingo
Publication venue: 'Wiley'

Crossref

Evaluation of the Clinical Practice of Shoulder Examination among Ten Experienced Shoulder Surgeons

Author: Attrup Mikkel L
Barfod Kristoffer W
Ganestam Ann
Hølmich Per
Publication date: 01/01/2015

Copenhagen University Research Information System

Correct treatment of acute Achilles tendon rupture - the difficult choice

Author: Barfod Kristoffer W
Ebskov Lars Bo
Ganestam Ann
Hansen Tina S
Hølmich Per
Madsen Bjørn Lindegård
Publication date: 01/01/2015

Copenhagen University Research Information System